    VM-MAD: a cloud/cluster software for service-oriented academic environments

    The availability of powerful computing hardware in IaaS clouds makes cloud computing attractive for computational workloads that were, up to now, run almost exclusively on HPC clusters. In this paper we present the VM-MAD Orchestrator software: an open source framework for cloudbursting Linux-based HPC clusters into IaaS clouds as well as computational grids. The Orchestrator is completely modular, allowing flexible configurations of cloudbursting policies. It can be used with any batch system or cloud infrastructure, dynamically extending the cluster when needed. A distinctive feature of our framework is that the policies can be tested and tuned in a simulation mode based on historical or synthetic cluster accounting data. In the paper we also describe how the VM-MAD Orchestrator was used in a production environment at the FGCZ to speed up the analysis of mass spectrometry-based protein data by cloudbursting to Amazon EC2. The advantages of this hybrid system are shown with a large evaluation run using about a hundred large EC2 nodes.
    Comment: 16 pages, 5 figures. Accepted at the International Supercomputing Conference ISC13, June 17--20, Leipzig, Germany.

    Grid Data Management in Action: Experience in Running and Supporting Data Management Services in the EU DataGrid Project

    In the first phase of the EU DataGrid (EDG) project, a Data Management System has been implemented and provided for deployment. The components of the current EDG Testbed are: a prototype of a Replica Manager Service built around the basic services provided by Globus, a centralised Replica Catalogue to store information about physical locations of files, and the Grid Data Mirroring Package (GDMP), which is widely used in various HEP collaborations in Europe and the US for data mirroring. During this year these services have been refined and made more robust so that they are fit to be used in a pre-production environment. Application users have been using this first release of the Data Management Services for more than a year. In the paper we present the components and their interaction, our implementation and experience, as well as the feedback received from our user communities. We have resolved not only issues regarding integration with other EDG service components but also many of the interoperability issues with components of our partner projects in Europe and the US. The paper concludes with the basic lessons learned during this operation. These conclusions provide the motivation for the architecture of the next generation of Data Management Services that will be deployed in EDG during 2003.
    Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, CA, USA, March 2003, 9 pages, LaTeX, PSN: TUAT007; all figures are in the directory "figures"

    GridCertLib: A Single Sign-on Solution for Grid Web Applications and Portals

    This paper describes the design and implementation of GridCertLib, a Java library that leverages a Shibboleth-based authentication infrastructure and the SLCS online certificate signing service to provide short-lived X.509 certificates and Grid proxies. The main use case envisioned for GridCertLib is to provide seamless and secure access to Grid X.509 certificates and proxies in web applications and portals: when a user logs in to the portal using SAML-based Shibboleth authentication, GridCertLib uses the SAML assertion to obtain a Grid X.509 certificate from the SLCS service and generate a VOMS proxy from it. We give an overview of the architecture of GridCertLib and briefly describe its programming model. We outline its application to some deployment scenarios and report on practical experience integrating GridCertLib into portals for Bioinformatics and Computational Chemistry applications, based on the popular P-GRADE and Django software.

    Data Mining the SDSS SkyServer Database

    An earlier paper (Szalay et al., "Designing and Mining MultiTerabyte Astronomy Archives: The Sloan Digital Sky Survey," ACM SIGMOD 2000) described the Sloan Digital Sky Survey's (SDSS) data management needs by defining twenty database queries and twelve data visualization tasks that a good data management system should support. We built a database and interfaces to support both the query load and a website for ad-hoc access. This paper reports on the database design, describes the data loading pipeline, and reports on the query implementation and performance. The queries typically translated to a single SQL statement. Most queries run in less than 20 seconds, allowing scientists to interactively explore the database. This paper is an in-depth tour of those queries. Readers should first have studied the companion overview paper, Szalay et al., "The SDSS SkyServer, Public Access to the Sloan Digital Sky Server Data," ACM SIGMOD 2002.
    Comment: 40 pages, Original source is at http://research.microsoft.com/~gray/Papers/MSR_TR_O2_01_20_queries.do

    Properties of Renormalization Group Transformations

    We describe some properties of Renormalization Group transformations. In particular, we show why some of the RG transformations have redundant eigenoperators with eigenvalues that cannot be determined by simple dimensional analysis, and we give the corresponding formulae.
    Comment: 13 pages, 5 figures

    The Sloan Digital Sky Survey Science Archive: Migrating a Multi-Terabyte Astronomical Archive from Object to Relational DBMS

    The Sloan Digital Sky Survey Science Archive is the first in a series of multi-Terabyte digital archives in Astronomy and other data-intensive sciences. To facilitate data mining in the SDSS archive, we adapted a commercial database engine and built specialized tools on top of it. Originally we chose an object-oriented database management system due to its data organization capabilities, platform independence, query performance and conceptual fit to the data. However, after we had used the object database for the first couple of years of the project, it began to fall short in terms of its query support and data mining performance. This was as much due to the inability of the database vendor to respond to our demands for features and bug fixes as it was due to their failure to keep up with the rapid improvements in hardware performance, particularly faster RAID disk systems. In the end, we were forced to abandon the object database and migrate our data to a relational database. We describe below the technical issues that we faced with the object database and how and why we migrated to relational technology.